30 research outputs found

Design and Evaluation of a Collective IO Model for Loosely Coupled Petascale Programming

Author: Espinosa Allan
Foster Ian
Iskra Kamil
Raicu Ioan
Wilde Michael
Zhang Zhao
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 31/12/2008

Loosely coupled programming is a powerful paradigm for rapidly creating higher-level applications from scientific programs on petascale systems, typically using scripting languages. This paradigm is a form of many-task computing (MTC) which focuses on the passing of data between programs as ordinary files rather than messages. While it has the significant benefits of decoupling producer and consumer and allowing existing application programs to be executed in parallel with no recoding, its typical implementation using shared file systems places a high performance burden on the overall system and on the user who will analyze and consume the downstream data. Previous efforts have achieved great speedups with loosely coupled programs, but have done so with careful manual tuning of all shared file system access. In this work, we evaluate a prototype collective IO model for file-based MTC. The model enables efficient and easy distribution of input data files to computing nodes and gathering of output results from them. It eliminates the need for such manual tuning and makes the programming of large-scale clusters using a loosely coupled model easier. Our approach, inspired by in-memory approaches to collective operations for parallel programming, builds on fast local file systems to provide high-speed local file caches for parallel scripts, uses a broadcast approach to handle distribution of common input data, and uses efficient scatter/gather and caching techniques for input and output. We describe the design of the prototype model and its implementation on the Blue Gene/P supercomputer, and present preliminary measurements of its performance on synthetic benchmarks and on a large-scale molecular dynamics application.

Comment: IEEE Many-Task Computing on Grids and Supercomputers (MTAGS08) 200

arXiv.org e-Print Archive

Crossref

Towards Loosely-Coupled Programming on Petascale Systems

Author: Beckman Pete
Clifford Ben
Foster Ian
Iskra Kamil
Raicu Ioan
Wilde Mike
Zhang Zhao
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 27/08/2008

We have extended the Falkon lightweight task execution framework to make loosely coupled programming on petascale systems a practical and useful programming model. This work studies and measures the performance factors involved in applying this approach to enable the use of petascale systems by a broader user community, and with greater ease. Our work enables the execution of highly parallel computations composed of loosely coupled serial jobs with no modifications to the respective applications. This approach allows a new, and potentially far larger, class of applications to leverage petascale systems, such as the IBM Blue Gene/P supercomputer. We present the challenges of I/O performance encountered in making this model practical, and show results using both microbenchmarks and real applications from two domains: economic energy modeling and molecular dynamics. Our benchmarks show that we can scale up to 160K processor-cores with high efficiency, and can achieve sustained execution rates of thousands of tasks per second.

Comment: IEEE/ACM International Conference for High Performance Computing, Networking, Storage and Analysis (SuperComputing/SC) 200

arXiv.org e-Print Archive

Crossref

Performance Analysis of a Parallel Discrete Model for the Simulation of Laser Dynamics

Author: Fernández de Vega Francisco
Guisado Lízar José Luís
Iskra Kamil A.
Publication venue: IEEE Computer Society
Publication date: 01/01/2006

This paper presents an analysis of the performance of a parallel implementation of a discrete model of laser dynamics, which is based on cellular automata. The performance of a 2D parallel version of the model is studied as a first step to test the feasibility of a parallel 3D version, which is needed to simulate specific laser systems. The 3D version will have to run on a parallel computer due to its runtime and memory requirements. The model has been implemented on a Beowulf cluster using the message passing paradigm. The parallel implementation is found to exhibit a good speedup, allowing us to run realistic simulations of laser systems on clusters of workstations, which could not be afforded on an individual machine due to the extensive runtime and memory size needed.

Funding: Ministerio de Educación y Ciencia TIC2002-04498-C05-0

idUS. Depósito de Investigación Universidad de Sevilla

Parallel implementation of a cellular automaton model for the simulation of laser dynamics

Author: Fernández de Vega Francisco
Guisado Lízar José Luís
Iskra Kamil A.
Jiménez-Morales Francisco de Paula
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2006

A parallel implementation for distributed-memory MIMD systems of a 2D discrete model of laser dynamics based on cellular automata is presented. The model has been implemented on a PC cluster using a message passing library. Good performance has been obtained, allowing us to run realistic simulations of laser systems on clusters of workstations, which could not be afforded on an individual machine due to the extensive runtime and memory size needed.

Funding: Ministerio de Educación y Ciencia TIN2005-08818-C04-0

idUS. Depósito de Investigación Universidad de Sevilla

Parallel Cellular Automata-based Simulation of Laser Dynamics using Dynamic Load Balancing

Author: Fernández de Vega Francisco
Guisado Lízar José Luís
Iskra Kamil A.
Jiménez-Morales Francisco de Paula
Sloot Peter M.A.
Publication venue: Inderscience Publishers
Publication date: 01/01/2008

We present an analysis of the feasibility of executing a parallel bioinspired model of laser dynamics, based on cellular automata (CA), on the usual target platform for this kind of application: a heterogeneous non-dedicated cluster. As this model employs a synchronous CA, using the single program, multiple data (SPMD) paradigm, it is not clear in advance whether an appropriate efficiency can be obtained on this kind of platform. We have evaluated its performance, including artificial load to simulate other tasks or jobs submitted by other users. A dynamic load balancing strategy with two main differences from most previous implementations of CA-based models has been used. First, it is possible to migrate load to cluster nodes initially not belonging to the pool. Second, a modular approach is taken in which the model is executed on top of a dynamic load balancing tool, the Dynamite system, gaining flexibility. Very satisfactory results have been obtained, with performance increases from 60% to 80%.

Funding: Ministerio de Ciencia e Innovación TIN2007-68083-C02; Junta de Extremadura PRI06A22

idUS. Depósito de Investigación Universidad de Sevilla

SOLAR: A Highly Optimized Data Loading Framework for Distributed Training of CNN-based Scientific Surrogates

Author: Beckman Pete
Bicer Tekin
Iskra Kamil
Jin Sian
Sun Baixi
Tao Dingwen
Tian Jiannan
Yu Xiaodong
Zhang Chengming
Zhou Tao
Publication date: 03/11/2022

CNN-based surrogates have become prevalent in scientific applications to replace conventional time-consuming physical approaches. Although these surrogates can yield satisfactory results with significantly lower computation costs over small training datasets, our benchmarking results show that data-loading overhead becomes the major performance bottleneck when training surrogates with large datasets. In practice, surrogates are usually trained with high-resolution scientific data, which can easily reach the terabyte scale. Several state-of-the-art data loaders have been proposed to improve the loading throughput in general CNN training; however, they are sub-optimal when applied to surrogate training. In this work, we propose SOLAR, a surrogate data loader that can ultimately increase loading throughput during training. It leverages our three key observations during the benchmarking and contains three novel designs. Specifically, SOLAR first generates a pre-determined shuffled index list and accordingly optimizes the global access order and the buffer eviction scheme to maximize the data reuse and the buffer hit rate. It then proposes a tradeoff between lightweight computational imbalance and heavyweight loading workload imbalance to speed up the overall training. It finally optimizes its data access pattern with HDF5 to achieve a better parallel I/O throughput. Our evaluation with three scientific surrogates and 32 GPUs illustrates that SOLAR can achieve up to 24.4X speedup over PyTorch Data Loader and 3.52X speedup over state-of-the-art data loaders.

Comment: 14 pages, 15 figures, 5 tables, submitted to VLDB '2

arXiv.org e-Print Archive

Laser Dynamics Modelling and Simulation: An application of Dynamic Load Balancing of Parallel Cellular Automata

Author: Fernández de Vega Francisco
Guerra Pérez José Manuel
Guisado Lízar José Luís
Iskra Kamil A.
Jiménez-Morales Francisco de Paula
Lombraña González Daniel
Sloot Peter M.A.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2010

Crossref

International Migration, Integration and Social Cohesion online publications

UvA-DARE

idUS. Depósito de Investigación Universidad de Sevilla

The inferior gluteal artery anatomy: a detailed analysis with implications for plastic and reconstructive surgery

Author: Bonczar Michał
Gabryszuk Kamil
Gliwa Jakub
Iskra Tomasz
Koziej Mateusz
Kłosiński Michał
Ostrowski Patryk
Walocha Jerzy
Wojciechowski Wadim
Yika Alicia del Carmen
Publication venue: 'VM Media SP. zo.o VM Group SK'
Publication date: 03/09/2015

Background: The inferior gluteal artery (IGA) is a large terminal branch of the anterior division of the internal iliac artery (ADIIA). There is a significant lack of data regarding the variable anatomy of the IGA. Materials and methods: A retrospective study was conducted to establish anatomical variations, their prevalence and morphometrical data on the IGA and its branches. The results of 75 consecutive patients who underwent pelvic computed tomography angiography (CTA) were analyzed. Results: The origin variation of each IGA was analyzed in depth. Four origin variations were observed. The most common, Type O1, occurred in 86 of the studied cases (62.3%). The median IGA length was 68.50 mm (LQ = 54.29; HQ = 86.06). The median distance from the origin of the ADIIA to the origin of the IGA was 38.22 mm (LQ = 20.22; HQ = 55.97). The median origin diameter of the IGA was 4.69 mm (LQ = 4.13; HQ = 5.45). Conclusions: The present study thoroughly analyzed the complete anatomy of the IGA and the branches of the ADIIA. A novel classification system for the origin of the IGA was created, where the most prevalent origin was from the ADIIA (Type 1; 62.3%). Furthermore, the morphometric properties (such as the diameter and length) of the branches of the ADIIA were analyzed. These data may be highly useful for physicians performing operations in the pelvis, such as interventional intraarterial procedures or various gynecological surgeries.

Via Medica Journals